Abstract. We consider a basic d-adic model for the scattering transform on the line.
We prove $L^2$ bounds for this scattering transform and a weak $L^2$ bound for a Carleson
type maximal operator (Theorem 1.4). The latter implies boundedness of d-adic models
of generalized eigenfunctions of Dirac type operators with potential in $L^2(\mathbb{R})$. We show
that this result cannot be obtained by estimating the terms in the natural multilinear
expansion of the scattering transform (Proposition 5.1).



1. Introduction 

It is widely understood that scattering transforms are nonlinear variants of the one
dimensional Fourier transform. Thus scattering transforms give nonlinear Fourier transforms
of scalar or more generally matrix valued potentials $F(x)$. For harmonic analysts
this suggests studying the basic a priori estimates of Fourier analysis (for example
Hausdorff Young inequalities or estimates for the Carleson operator) in the case of
scattering transforms. This naturally leads to the study of the nonlinear Fourier transform
for rough and slowly decaying potentials. Beals and Coifman [?] study in detail the
case when the potential is (generic) in $L^1$ or in weighted spaces $L^1 \cap L^2$ with weights of
the form $(1 + |x|)^m$ for suitable $m$. More recently, Christ and Kiselev [?], [7] have proven
analogues of the Hausdorff Young inequality and a maximal Hausdorff Young inequality
for a scattering transform. These are estimates for potentials in $L^p$. Their result implies
boundedness of eigenfunctions of one dimensional Schrödinger operators with potential in
$L^p$, $1 < p < 2$, for almost all positive energies. By an extension by Simon of a theorem of
Sch'nol [13] this implies that the absolutely continuous spectrum of the Schrödinger operator
is supported on the entire positive half axis, see also [?], page 501. This implication
was one of the motivations of Christ and Kiselev to study the maximal Hausdorff Young
inequalities for the scattering transform. We propose to study the analogue of Carleson's
theorem [?], or the sharper form by Hunt [?] (see also [?] for a recent proof), for scattering
transforms. This amounts to an $L^2$ endpoint of the results by Christ and Kiselev and
would give boundedness of eigenfunctions of Schrödinger operators with potential in $L^2$.
The question of absolutely continuous spectrum for potentials in $L^2$ has been settled in
the affirmative by Deift and Killip [?], but this is a weaker statement than the conjectured
boundedness of eigenfunctions.

Currently we are not able to decide whether the analogue of Carleson's theorem as
stated below is true or false. The purpose of this article is to study a d-adic model for
this problem and prove a positive result for this model.
We restrict attention to one of the easiest cases of the scattering transform. Thus
consider the special AKNS-ZS system (named after [1] and [?]):

$\frac{d}{dx} f(x) = kJf(x) + q(x)f(x)$ (1)
for the unknown function $f : \mathbb{R} \to \mathbb{C}^2$ where $k \in \mathbb{C}$ is a spectral parameter,

This can be read as an eigenfunction equation for Dirac operators on the real line. More
generally one can write the eigenfunction equation for Schrödinger operators on the real
line in the framework of AKNS-ZS systems. This links to the work of Christ and Kiselev,
but we shall not elaborate on this generalization. We remark that Conjecture 1.3 (as well
as the other conjectures formulated below) would imply boundedness of solutions to (1)
for almost every $k \in \mathbb{R}$.

We shall assume that $F$ is locally integrable and for simplicity compactly supported.
Writing $a(x) \exp(-ikx)$ and $b(x) \exp(ikx)$ for the two components of $f$ and assuming $k$ is
real we obtain the following equivalent ordinary differential equation

$G' = WG$ (2)

where

Since $F$ is compactly supported, equation (2) forces $G$ to be constant near $-\infty$ and near
$\infty$. Let $G(-\infty)$ and $G(\infty)$ denote these constant values. If we impose the initial condition
$G(-\infty) = \mathrm{id}$, then standard existence theorems give a unique absolutely continuous
solution satisfying (2) almost everywhere. Thus we can define $G(\infty)$ to be the scattering
transform of the potential $F$ at the spectral value $k$.

We have implicitly used that the differential equation (2) forces $G$ to remain in the form
stated in (3), if it is initially of that form. It also forces $G$ to have constant determinant,
which for our chosen initial condition is equal to 1. In other words, $G$ takes values in the
Lie group $SU(1,1)$, see also the discussion in Section 2.

To prove a priori estimates for the scattering transform, we need a notion of size for
the matrices $G$. A natural size appearing in the $L^2$ theory of this scattering transform is
$\sqrt{\log|a|}$ where $a$ is the upper left entry of the matrix $G$. Observe that $\log|a|$ is
nonnegative since $|a|^2 - |b|^2 = 1$.

The following are known analogues of standard estimates for the Fourier transform.
Recall that $a(\infty)$ and $a(x)$ for given $x$ are functions of the parameter $k$, which we have
suppressed in the notation.

Theorem 1.1. Riemann Lebesgue estimate

$\|\sqrt{\log|a(\infty)|}\|_{\infty} \le C\|F\|_1$

Hausdorff Young estimate ($1 < p < 2$)

$\|\sqrt{\log|a(\infty)|}\|_{p'} \le C_p\|F\|_p$



A CARLESON TYPE THEOREM 






Plancherel identity 



|Vlog|a(oo)||| 2 = -||F|| 2 (4) 



Maximal Riemann Lebesgue estimate

$\|\sup_x \sqrt{\log|a(x)|}\|_{L^\infty(k)} \le C\|F\|_1$

Maximal Hausdorff Young estimate ($1 < p < 2$)

$\|\sup_x \sqrt{\log|a(x)|}\|_{L^{p'}(k)} \le C_p\|F\|_p$

The Riemann Lebesgue and maximal Riemann Lebesgue estimates follow easily from
Gronwall's inequality, i.e., from applying operator norms to (2) and integrating the
inequality

$\|G\|'/\|G\| \le \|W\|$ .

Then one uses $\sqrt{\log|a|} \sim \log\|G\|$ for small values of $a$ (that is, $|a|$ close to 1) and
$\log|a| \sim \log\|G\|$ for large values of $a$. We remark that in the $L^1$ theory one may view
$\log\|G\|$ as the more natural measure of the size of $G$ than $\sqrt{\log|a|}$.

The Hausdorff Young and maximal Hausdorff Young inequalities follow by the work
of Christ and Kiselev [?], [10]. The Plancherel identity is a well known scattering identity.
Variants of it appear in [?] and [?], [?]. For the convenience of the reader and to contrast
it to our results in the d-adic model we will sketch a proof in the appendix. Interestingly,
while Plancherel gives the $L^2$ endpoint of the Hausdorff Young inequality, we do not know
whether the constant $C_p$ in the Hausdorff Young inequality can be chosen uniformly as $p$
tends to 2.

The maximal version of Plancherel, which amounts to a scattering variant of Carleson's
theorem, is not known. We state it as a conjecture.

Conjecture 1.2. Carleson-Hunt estimate

$\|\sup_x \sqrt{\log|a(x)|}\|_{L^2(k)} \le C\|F\|_2$ .

A more modest conjecture is

Conjecture 1.3. Weak type Carleson estimate

$|\{k : \sup_x \sqrt{\log|a(x)|} > \lambda\}| \le C\lambda^{-2}\|F\|_2^2$ .

Even more modestly one could conjecture that the function $\sup_x \sqrt{\log|a(x)|}$ is finite
almost everywhere for $F$ in $L^2(\mathbb{R})$. To make $G$ well defined for this last conjecture, which is
formulated in terms of the scattering transform for arbitrary $F \in L^2(\mathbb{R})$, one may replace
the initial condition $G(-\infty) = \mathrm{id}$ by $G(0) = \mathrm{id}$.

The main purpose of the current article is to give some supporting evidence at least
for Conjecture 1.3 by proving a variant of it in a d-adic model. The d-adic model is
obtained by replacing the exponential functions in (2), which are the characters on $\mathbb{R}$, by
characters of an infinite product of copies of $\mathbb{Z}_d$ for some integer $d > 1$. We call these
groups Cantor groups.
From now on, $x$ and $k$ will denote non-negative real numbers. For almost all such
numbers, we have unique expansions with base $d$:

$k = \sum_{n \in \mathbb{Z}} k_n d^n$ , $x = \sum_{n \in \mathbb{Z}} x_n d^n$

where $k_n$ and $x_n$ take values in $0, 1, \dots, d-1$ and they are zero for sufficiently large positive
index $n$. Indeed, we shall make these expansions unique for all $x$ and $k$ by requiring each
of them to have only finitely many non-zero entries whenever possible.
Then we define a character function $w$ on $\mathbb{R}_0^+ \times \mathbb{R}_0^+$ as

$w(k,x) = \gamma^{\sum_{n \in \mathbb{Z}} k_n x_{-1-n}}$ , (5)

where $\gamma \in \mathbb{C}$ is some fixed primitive $d$-th root of unity. The exact choice of $\gamma$ is not
important. Observe that the formally infinite sum in the exponent in (5) has only finitely
many non-zero summands.
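
As an illustration of definition (5), the following minimal Python sketch evaluates the exponent $\sum_n k_n x_{-1-n}$ and the character $w(k,x)$ for arguments with finite base-$d$ expansions. The helper names `digit`, `char_exponent` and `w`, the digit window `span`, and the particular choice $\gamma = e^{2\pi i/d}$ are assumptions made for this sketch only.

```python
from fractions import Fraction
import cmath


def digit(t, n, d):
    """Base-d digit t_n of t = sum_n t_n d^n (n may be negative), for t >= 0."""
    return int(Fraction(t) / Fraction(d) ** n) % d


def char_exponent(k, x, d, span=32):
    """Exponent sum_n k_n x_{-1-n} modulo d, scanning digits n in [-span, span).

    Assumes k and x have finite base-d expansions inside the scanned window.
    """
    return sum(digit(k, n, d) * digit(x, -1 - n, d) for n in range(-span, span)) % d


def w(k, x, d):
    """Character w(k, x) = gamma ** exponent, with gamma = exp(2*pi*i/d) (one possible choice)."""
    gamma = cmath.exp(2j * cmath.pi / d)
    return gamma ** char_exponent(k, x, d)
```

For $d = 2$ and integer $k$ this reproduces the sign pattern of Walsh functions on $[0,1)$; for instance `w(1, Fraction(1, 2), 2)` is numerically $-1$. The exponent is symmetric in $k$ and $x$, as can be read off from (5) by substituting $n \to -1-n$.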

Let $F \in L^2(\mathbb{R}^+)$. For every parameter $k$ we consider the following initial value problem:

$\partial_x G(k,x) = W(k,x) G(k,x)$ (6)
$G(k,0) = \mathrm{id}$

where

$W(k,x) = \begin{pmatrix} 0 & F(x)w(k,x) \\ \overline{F(x)w(k,x)} & 0 \end{pmatrix}$ .

By standard ODE theory this initial value problem has a unique absolutely continuous 
solution satisfying the ODE almost everywhere. 
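
For a potential that is constant on d-adic intervals, the initial value problem (6) can be solved exactly by multiplying matrix exponentials. The sketch below does this under simplifying assumptions that are not part of the text: $F$ is supported in $[0,1)$ and constant on the $d^M$ intervals of length $d^{-M}$, $k$ is an integer in $[0, d^M)$, and $\gamma = e^{2\pi i/d}$. Matrices of the form $\begin{pmatrix} a & b \\ \bar b & \bar a \end{pmatrix}$ are stored as pairs `(a, b)`; all function names are hypothetical.

```python
import cmath
import math


def su11_mul(g, h):
    """Product of two matrices [[a, b], [conj b, conj a]] stored as pairs (a, b)."""
    (a, b), (c, e) = g, h
    return (a * c + b * e.conjugate(), a * e + b * c.conjugate())


def step_exp(c, h):
    """exp(h * [[0, c], [conj c, 0]]) as a pair (a, b) = (cosh(h|c|), c/|c| sinh(h|c|))."""
    r = abs(c)
    if r == 0:
        return (1 + 0j, 0j)
    return (complex(math.cosh(h * r)), c / r * math.sinh(h * r))


def base_digits(m, d, M):
    """Digits of the integer m in base d, least significant first, padded to length M."""
    return [(m // d**j) % d for j in range(M)]


def scattering(F, k, d, M):
    """G(k, infinity) as a pair (a, b), for F given by its d**M step values on [0, 1)."""
    gamma = cmath.exp(2j * cmath.pi / d)
    kd = base_digits(k, d, M)                 # kd[n] = k_n for 0 <= n < M
    G = (1 + 0j, 0j)
    for m, Fm in enumerate(F):
        md = base_digits(m, d, M)             # x_{-1-n} = md[M-1-n] on the m-th interval
        e = sum(kd[n] * md[M - 1 - n] for n in range(M)) % d
        # pieces further to the right multiply on the left
        G = su11_mul(step_exp(Fm * gamma**e, d**-M), G)
    return G
```

In this sketch the output always satisfies $|a|^2 - |b|^2 = 1$ up to rounding, and for real $F$ and $d = 2$ it reduces to $(\cosh S, \sinh S)$ with $S$ the signed sum $\sum_m \pm F_m 2^{-M}$.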

We denote again by a(k,x) the upper left entry of G(k,x). The main theorem of this 
article is the following 

Theorem 1.4. Let $d > 1$ and $F \in L^2(\mathbb{R}^+)$, and let $G$ be defined by (6). Then for almost
all $k \in \mathbb{R}^+$ the limit

$G(k,\infty) = \lim_{x \to +\infty} G(k,x)$ (7)

exists and satisfies the estimate

$\int_0^\infty \log|a(k,\infty)| \, dk \le C \int_0^\infty |F(x)|^2 \, dx$ . (8)

Moreover,

$|\{k : \sup_x \sqrt{\log|a(k,x)|} > \lambda\}| \le C\lambda^{-2}\|F\|_2^2$ (9)

for all $\lambda > 0$. Here as well as in (8) the constant $C$ may grow polynomially in $d$ but is
independent of $F$ and $\lambda$.

If $F$ is real valued, then a special situation occurs in Theorem 1.4 for $d = 2$: the matrices
$W(k,x)$ then are real valued and commute for different values of $x$. By simultaneously
diagonalizing all these matrices one can decouple the two equations and obtain an ODE
of the form $G' = VG$ with

$V(k,x) = \begin{pmatrix} F(x)w(k,x) & 0 \\ 0 & -F(x)w(k,x) \end{pmatrix}$ .

The solution at $+\infty$ of the corresponding initial value problem is

$G(k,\infty) = \begin{pmatrix} \exp(\hat F(k)) & 0 \\ 0 & \exp(-\hat F(k)) \end{pmatrix}$

where $\hat F$ denotes the Walsh-Fourier transform (the Fourier transform with respect to
the Cantor group with $d = 2$). In this special case Theorem 1.4 follows simply from
the known Plancherel identity and Carleson's theorem for the Walsh Fourier transform
[?]. This example shows nicely the connection between scattering transforms and the Fourier
transform.

We will outline the proof of (8) in Section 2. The proof is based on certain swapping
inequalities, which are discussed in detail in Section 3. The proof of these inequalities
seems to be a genuinely new ingredient in the d-adic model as compared to the theory of
the linear Fourier transform. In Section 4 we prove (9), which then easily implies (7).

Initially the authors had attempted to use multilinear expansions of the solutions to
(2) to prove Conjecture 1.3 in the way Christ and Kiselev prove their results for $p < 2$.
However, as was observed in [?], the terms in this expansion do not satisfy reasonable
bounds for $F \in L^2(\mathbb{R})$. Since the purpose of the current article is to compare the d-adic
to the continuous case, we prove a result (Proposition 5.1) in Section 5 which shows that
the multilinear terms in the d-adic setting are equally badly behaved. This is the second
new result of this article.

In the appendix (Section 6) we sketch a proof of the Plancherel identity (4) in Theorem
1.1. We only know a proof of this identity using complex contour integration. This proof
does not seem to have the same flexibility as the proof in the d-adic case, which decomposes
the scattering transform into its elementary pieces. This in a sense is the main reason
why at this point we are unable to prove Carleson's theorem for the continuous scattering
transform.

The first author was supported by NSF grant DMS 0100796. The second author is a 
Clay Prize Fellow and is supported by a grant from the Packard Foundation. The third
author was supported by a Sloan Fellowship and by NSF grants DMS 9985572 and DMS 
9970469. 



2. Proof of the Plancherel inequality (8)

First we consider the case of compactly supported $F$. Thus, for fixed $k$, $G(k,x)$ becomes
constant for large $x$ and the existence of the limit $G(k,\infty)$ is not in question.
Recall that $SU(1,1)$ is the Lie group of all complex $2 \times 2$ matrices of the form

$\begin{pmatrix} a & b \\ \bar b & \bar a \end{pmatrix}$ (10)

with determinant $|a|^2 - |b|^2 = 1$. This group is isomorphic to $SL_2(\mathbb{R})$. Observe that
$W(k,x)$ is an element of the Lie algebra of $SU(1,1)$, and thus the solution to the initial
value problem (6), which is well known to exist as an absolutely continuous function,
takes values in $SU(1,1)$. Of course one can verify directly by an elementary calculation
that the solution to (6) has the form (10) and determinant 1 for all $x$, which is all we
need from this brief discussion of Lie groups.
The following is an easy observation about breaking the ODE (6) into pieces along the
$x$ variable. For any interval $\omega \subset \mathbb{R}_0^+$ define the localized system

$\partial_x G_\omega(k,x) = W_\omega(k,x) G_\omega(k,x)$ (11)
$G_\omega(k,0) = \mathrm{id}$ ,

where

$W_\omega(k,x) = \begin{pmatrix} 0 & F(x) 1_\omega(x) w(k,x) \\ \overline{F(x) 1_\omega(x) w(k,x)} & 0 \end{pmatrix}$ .

Lemma 2.1. Let $\omega_1, \omega_2, \dots, \omega_n$ be adjacent intervals in ascending order, and let the union
of these intervals be the interval $\omega$. Then we have for all $k \ge 0$:

$G_\omega(k,\infty) = \prod_{j=n}^{1} G_{\omega_j}(k,\infty) = G_{\omega_n}(k,\infty) \cdots G_{\omega_2}(k,\infty) G_{\omega_1}(k,\infty)$ .

Proof: By induction the lemma follows from the special case of two adjacent intervals
$\omega_1$ and $\omega_2$. Fix $k$. It is easy to check that the absolutely continuous function

$G_{\omega_2}(k,x) G_{\omega_1}(k,x)$ (12)

satisfies the differential equation for $G_\omega(k,x)$ almost everywhere. This follows easily from
letting $x_0$ be the point separating $\omega_1$ and $\omega_2$ and considering $x < x_0$ and $x > x_0$ separately.
Since (12) also satisfies the correct initial condition, this proves the lemma. ■

Next, we claim that if $\omega$ is a d-adic interval, that is, an interval of the form

$[d^\kappa n, d^\kappa (n+1))$

with integers $\kappa$ and $n \ge 0$, then $G_\omega(k,x)$ does not change much as $k$ varies inside a d-adic
interval of reciprocal length $d^{-\kappa}$.

To make this claim precise, we define a tile to be a rectangle $p = I \times \omega$ of the form

$[d^\kappa n, d^\kappa (n+1)) \times [d^{-\kappa} l, d^{-\kappa} (l+1))$

with integers $\kappa, n, l$ such that $l, n \ge 0$.

Lemma 2.2. Let $I \times \omega$ be a tile. Let $k_0$ be the left endpoint of $I$ and let $k$ be any point
in $I$. Then there is an integer $j = j(k)$ independent of $x$ such that if

$G_\omega(k_0,x) = \begin{pmatrix} a & b \\ \bar b & \bar a \end{pmatrix}$

then

$G_\omega(k,x) = \begin{pmatrix} a & \gamma^j b \\ \overline{\gamma^j b} & \bar a \end{pmatrix}$ .

In particular, the first entry $a$ of $G_\omega(k,x)$ is independent of $k$ as long as $k \in I$.
Proof: Assume the length of $I$ is $d^\kappa$. Since $G_\omega$ is constant outside $\omega$, it suffices to show
the claim for $x \in \omega$. We split $w(k,x)$ into two factors as follows:

$w(k,x) = \left[\gamma^{\sum_{\nu < \kappa} k_\nu x_{-1-\nu}}\right] \left[\gamma^{\sum_{\nu \ge \kappa} k_\nu x_{-1-\nu}}\right]$ .

Observe that if $x$ varies in $\omega$, then the first factor in this splitting does not change.
Likewise, the second factor is constant for $k \in I$. Thus there is a $j$ depending on $k \in I$
such that

$w(k,x) 1_\omega(x) = \gamma^j w(k_0,x) 1_\omega(x)$ .

Now let $\Gamma$ be the constant matrix

$\Gamma = \begin{pmatrix} \gamma^j & 0 \\ 0 & 1 \end{pmatrix}$ .

Then

$W_\omega(k,x) = \Gamma W_\omega(k_0,x) \Gamma^{-1}$ .

By conjugating the initial value problem (11) by $\Gamma$ we observe that

$G_\omega(k,x) = \Gamma G_\omega(k_0,x) \Gamma^{-1}$ .

This proves the lemma. ■
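
The key step of this proof, namely that $w(k,x) = \gamma^j w(k_0,x)$ with $j$ independent of $x \in \omega$, can be checked on examples with exact rational arithmetic. The following sketch samples points of $\omega$ and collects the observed values of $j$; the function names, the sampling depth and the digit window `span` are assumptions of this illustration, and the arguments are assumed to have finite base-$d$ expansions.

```python
from fractions import Fraction


def digit(t, n, d):
    """Base-d digit t_n of the non-negative rational t = sum_n t_n d^n."""
    return int(Fraction(t) / Fraction(d) ** n) % d


def exponent(k, x, d, span=12):
    """Exponent sum_n k_n x_{-1-n} modulo d of the character w(k, x)."""
    return sum(digit(k, n, d) * digit(x, -1 - n, d) for n in range(-span, span)) % d


def phase_shifts(d, kappa, n, l, k, depth=3):
    """Set of all j with w(k, x) = gamma^j w(k0, x), over sample points x in omega.

    The tile is I x omega with I = [d^kappa n, d^kappa (n+1)) and
    omega = [d^-kappa l, d^-kappa (l+1)); k0 is the left endpoint of I.
    """
    k0 = Fraction(d) ** kappa * n
    assert k0 <= k < k0 + Fraction(d) ** kappa, "k must lie in I"
    left, length = Fraction(d) ** -kappa * l, Fraction(d) ** -kappa
    xs = [left + length * Fraction(i, d ** depth) for i in range(d ** depth)]
    return {(exponent(k, x, d) - exponent(k0, x, d)) % d for x in xs}
```

A single-element result, for example `phase_shifts(3, 1, 2, 4, Fraction(65, 9)) == {1}`, is what the lemma predicts: the phase depends on $k$ but not on the sample point in $\omega$.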

Motivated by this lemma we shall define for a tile $p = I \times \omega$:

$G_p = G_\omega(k_0, \infty)$

where $k_0$ is the left endpoint of $I$. Next, we shall investigate the relation of the matrices
$G_p$ for nearby tiles $p$. Here we mean by nearby tiles that the tiles are contained in a given
d-adic rectangle of area $d$.

Define a multitile to be a rectangle $P = I \times \omega$ of the form

$[d^\kappa n, d^\kappa (n+1)) \times [d^{1-\kappa} l, d^{1-\kappa} (l+1))$

with integers $\kappa, n, l$ and $l, n \ge 0$. There are $d$ tiles $p_j$, $j = 0, \dots, d-1$, contained in $P$ of
the form $I \times \omega_j$. We shall always assume the $\omega_j$ are ordered in ascending order. We call
these tiles the horizontal subtiles of $P$. Moreover, there are $d$ tiles $q_j$, $j = 0, \dots, d-1$,
contained in $P$ of the form $I_j \times \omega$. We shall again assume the $I_j$ are ordered in ascending
order, and we call these tiles the vertical subtiles of $P$.



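[Figure: a multitile $P$ with its horizontal subtiles $p_0, \dots, p_{d-1}$ stacked from bottom to top and its vertical subtiles $q_0, \dots, q_{d-1}$ ordered from left to right.]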



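Lemma 2.3. Let $P = I \times \omega$ be a multitile and assume its horizontal tiles are $p_0, \dots, p_{d-1}$
and its vertical tiles are $q_0, \dots, q_{d-1}$. If

$G_{p_j} = \begin{pmatrix} a_j & b_j \\ \overline{b_j} & \overline{a_j} \end{pmatrix}$

for $j = 0, \dots, d-1$, then

$G_{q_m} = \prod_{j=d-1}^{0} \begin{pmatrix} a_j & \gamma^{mj} b_j \\ \overline{\gamma^{mj} b_j} & \overline{a_j} \end{pmatrix}$

for $m = 0, \dots, d-1$. Here the product is to be read in descending order.

Proof: Let $k_0$ denote the left endpoint of $I$ and let $d^\kappa$ be the length of $I$. Let $p_j = I \times \omega_j$.
By Lemma 2.1 it suffices to prove for $0 \le m \le d-1$:

$G_{\omega_j}(k_0 + d^{\kappa-1} m, \infty) = \begin{pmatrix} a_j & \gamma^{mj} b_j \\ \overline{\gamma^{mj} b_j} & \overline{a_j} \end{pmatrix}$ . (13)

However, we have for $x \in \omega_j$

$w(k_0 + d^{\kappa-1} m, x) = \left(\gamma^{\sum_{\nu < \kappa-1} k_\nu x_{-1-\nu}}\right) \gamma^{mj} \left(\gamma^{\sum_{\nu \ge \kappa} k_\nu x_{-1-\nu}}\right)$
$= \gamma^{mj} w(k_0, x)$ .

Now (13) follows by the considerations in the proof of Lemma 2.2.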

In the next section we will obtain a function $\beta : SU(1,1) \to \mathbb{R}_0^+$ such that

$C^{-1} \log|a| \le \beta(G) \le C \log|a|$

for some constant $C$ depending polynomially on $d$ and, with the notation of Lemma 2.3,

$\sum_{m=0}^{d-1} \beta(G_{q_m}) \le d \sum_{j=0}^{d-1} \beta(G_{p_j})$ . (14)

We will refer to $\beta$ as the swapping function and (14) as the swapping property. Assume
for now this swapping function has been constructed. The rest of this section is to prove
(8) using this function.

Let $K$ be a large integer and consider the rectangle $R = [0, d^K) \times [0, d^K)$. Let $\mathbf{p}_\kappa$ denote
the set of all tiles $I \times \omega \subset R$ with $|I| = d^\kappa$. One can partition the tiles in $\mathbf{p}_\kappa$ into d-tuples
such that each d-tuple consists of the vertical tiles of a multitile, whose horizontal tiles
then belong to $\mathbf{p}_{\kappa+1}$. Applying (14) on each tuple we obtain

$\sum_{p \in \mathbf{p}_\kappa} \beta(G_p) \le d \sum_{p \in \mathbf{p}_{\kappa+1}} \beta(G_p)$ .

By iterating this we obtain

$d^{-K} \sum_{p \in \mathbf{p}_{-K}} \beta(G_p) \le d^{K} \sum_{p \in \mathbf{p}_K} \beta(G_p)$ .

Hence

$d^{-K} \sum_{p \in \mathbf{p}_{-K}} \log|a_p| \le C d^K \sum_{p \in \mathbf{p}_K} \log|a_p|$ (15)

where $a_p$ denotes the upper left entry of $G_p$. Observe that since we have no control
over $K$ it is very important that there are no further constants on the right hand side
of (14) other than the constant $d$, which is the natural scaling constant (as we will see
momentarily).

We may assume that the support of $F$ is contained in $[0, d^K)$. Then for a tile $p = I \times [0, d^K)$
we have that $a_p$ is equal to $a(k,\infty)$ where $k$ is the left endpoint of $I$, or, by
Lemma 2.2, where $k$ is any point in $I$. Thus the left hand side of (15) is equal to

$\int_0^{d^K} \log|a(k,\infty)| \, dk$ .

Thus it remains to show that the right hand side of (15) is less than

$C\|F\|_2^2$

for arbitrarily large $K$ and constant $C$ independent of $K$. Observe that (11) implies

$\frac{d}{dx}\|G_\omega\|_{op} \le \|W_\omega\|_{op} \|G_\omega\|_{op}$

which implies together with the initial condition for $G$ at 0:

$\|G_\omega(k,\infty)\|_{op} \le \exp\left(\int_0^\infty \|W_\omega\|_{op}\right)$ .

This is Gronwall's inequality and, as has been mentioned before, implies the $L^1$ estimates
claimed in Theorem 1.1. Continuing the present considerations we obtain

$\log\|G_\omega(k,\infty)\|_{op} \le \|F\|_{L^1(\omega)}$ .

We claim that the operator norm of a matrix of the type (10) is equal to $|a| + |b|$. This is
clear in the case that $a$ and $b$ are real, in which it is easy to calculate the eigenvalues of
the symmetric matrix. The general case can be obtained by multiplying the matrix from
both sides by unitary diagonal matrices to reduce to the real case.
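
The claim can also be checked numerically. The sketch below computes the largest singular value of a general complex $2 \times 2$ matrix from its trace and determinant data and compares it with $|a| + |b|$; the helper names `op_norm_2x2` and `su11` are assumptions of this illustration.

```python
import cmath
import math


def op_norm_2x2(m):
    """Largest singular value of a complex 2x2 matrix ((p, q), (r, s))."""
    (p, q), (r, s) = m
    t = abs(p)**2 + abs(q)**2 + abs(r)**2 + abs(s)**2   # trace of M* M
    det = abs(p * s - q * r)                            # |det M|
    # eigenvalues of M* M solve lambda^2 - t*lambda + det^2 = 0
    return math.sqrt((t + math.sqrt(max(t * t - 4 * det * det, 0.0))) / 2)


def su11(b, phase=0.0):
    """Matrix [[a, b], [conj b, conj a]] with |a|^2 - |b|^2 = 1 and arg a = phase."""
    b = complex(b)
    a = cmath.rect(math.sqrt(1 + abs(b)**2), phase)
    return ((a, b), (b.conjugate(), a.conjugate()))
```

For `su11(0.3 + 0.4j, 1.1)` one has $|b| = 0.5$ and $|a| = \sqrt{1.25}$, and the computed norm agrees with $|a| + |b|$ up to rounding. Its logarithm equals $\mathrm{arcsinh}\,|b|$, since $\mathrm{arcsinh}(t) = \log(t + \sqrt{1 + t^2})$.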
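By Hölder's inequality we thus have

$\log(|a_\omega| + |b_\omega|) \le \|F\|_{L^2(\omega)} |\omega|^{1/2}$ .

By choosing $\omega$ small enough ($K$ large enough), the right hand side can be made small.
Thus we can assume $|a_\omega|$ is close to 1 and $b_\omega$ is close to 0. Then we obtain

$\log(|a_\omega| + |b_\omega|) \ge \frac{1}{2}|b_\omega| \ge \frac{1}{2}\sqrt{\log|a_\omega|}$ .

Hence

$d^K \sum_{p \in \mathbf{p}_K} \log|a_p| \le 4 \sum_{p \in \mathbf{p}_K} \|F\|_{L^2(\omega_p)}^2 = 4\|F\|_2^2$ ,

where $\omega_p$ denotes the second factor of the tile $p = I \times \omega_p$.

This gives the desired bound on the right hand side of (15) and completes the proof of
inequality (8) in the case of compactly supported $F$.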

If $F$ is not compactly supported then we will show later that the limit (7) exists almost
everywhere. Assuming this for now, (8) follows by Fatou's lemma.
3. The swapping function
In this section we will find for each $d > 1$ a function

$\beta : SU(1,1) \to \mathbb{R}_0^+$

such that $\beta(G)$ is comparable to $\log|a|$ and we have the inequality (14). We call this
inequality a swapping inequality and $\beta$ a swapping function because (14) swaps the vertical
tiles to the horizontal tiles in a given multitile.

The case $d = 2$ is particularly easy and we will do it first. The function $\beta$ simply can be
chosen to be the logarithm of the Hilbert Schmidt norm of $G$. Here we define the Hilbert
Schmidt norm of a matrix $G$ of the form (10) to be

$\|G\|_{HS} = \sqrt{|a|^2 + |b|^2}$ .

Observe that for $a$ near 1 we have

$\log\|G\|_{HS} \sim \log(|a|^2 + |b|^2) \sim |b|^2 \sim \log|a|$ (16)

and for large $a$ we have

$\log\|G\|_{HS} \sim \log(2|a|^2) \sim \log|a|$ . (17)

Thus $\beta(G)$ is comparable to $\log|a|$. We write

$A^{-*} := (A^{-1})^*$ (18)

The following lemma then says that the swapping inequality is true.
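Lemma 3.1. If $A, B \in SU(1,1)$ then

$\log\|AB\|_{HS} + \log\|AB^{-*}\|_{HS} \le 2\log\|A\|_{HS} + 2\log\|B\|_{HS}$ .

Proof: Write

$A = \begin{pmatrix} a & b \\ \bar b & \bar a \end{pmatrix}$

and

$B = \begin{pmatrix} c & d \\ \bar d & \bar c \end{pmatrix}$ .

Then we have

$AB = \begin{pmatrix} ac + b\bar d & b\bar c + ad \\ \bar b c + \bar a \bar d & \bar a \bar c + \bar b d \end{pmatrix}$ , $AB^{-*} = \begin{pmatrix} ac - b\bar d & b\bar c - ad \\ \bar b c - \bar a \bar d & \bar a \bar c - \bar b d \end{pmatrix}$ .

This gives

$\|AB\|_{HS}^2 + \|AB^{-*}\|_{HS}^2 = 2(|ac|^2 + |bd|^2 + |bc|^2 + |ad|^2) = 2\|A\|_{HS}^2\|B\|_{HS}^2$ .

Using the arithmetic mean-geometric mean inequality we obtain

$\|AB\|_{HS}\|AB^{-*}\|_{HS} \le \|A\|_{HS}^2\|B\|_{HS}^2$ .

Taking logarithms proves the lemma. ■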
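
The identity and the inequality in this proof are easy to test numerically. In the following sketch, matrices of the form (10) are stored as pairs `(a, b)`, so that $B^{-*}$ corresponds to `(c, -d)`; all function names are hypothetical and chosen for this illustration.

```python
import cmath
import math


def su11(b, phase):
    """Pair (a, b) with |a|^2 - |b|^2 = 1 and arg a = phase."""
    return (cmath.rect(math.sqrt(1 + abs(b)**2), phase), complex(b))


def mul(g, h):
    """Product of matrices [[a, b], [conj b, conj a]] stored as pairs (a, b)."""
    (a, b), (c, e) = g, h
    return (a * c + b * e.conjugate(), a * e + b * c.conjugate())


def inv_star(g):
    """G^{-*} = (G^{-1})^*; for determinant 1 this flips the sign of b."""
    a, b = g
    return (a, -b)


def hs(g):
    """Hilbert Schmidt norm in the normalisation sqrt(|a|^2 + |b|^2)."""
    a, b = g
    return math.sqrt(abs(a)**2 + abs(b)**2)


def swap_gap(A, B):
    """Right hand side minus left hand side of Lemma 3.1 (non-negative if the lemma holds)."""
    lhs = math.log(hs(mul(A, B))) + math.log(hs(mul(A, inv_star(B))))
    return 2 * math.log(hs(A)) + 2 * math.log(hs(B)) - lhs
```

For any two inputs built with `su11`, `swap_gap` should be non-negative up to rounding, and `hs(mul(A, B))**2 + hs(mul(A, inv_star(B)))**2` should equal `2 * hs(A)**2 * hs(B)**2`.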

We remark that the function $\beta(G) := \log|a|$ does not satisfy the swapping inequality
in general. This can be seen from choosing $a, b, c$ positive and $d$ purely imaginary in the
above example.
Now consider $d \ge 3$. In this case one has to choose a more complicated swapping
function. Indeed, in an appendix to this section we will sketch an argument that the
logarithm of the Hilbert Schmidt norm does not satisfy the required swapping inequality.

Choose an $\epsilon$ sufficiently small. For the purpose of keeping track of polynomial growth
in the parameter $d$ we remark that the choice $\epsilon = 10^{-3} d^{-1}$ will be sufficient.

Let $r$ be the smallest positive number such that

$r^2 - r^3 = \epsilon^{10} + \epsilon^{20} \mathrm{arcsinh}(r)$ .

Then $r$ is of the order $\epsilon^5$. We consider the following swapping function defined on $\mathbb{C}$:

$\beta(z) := |z|^2 - |z|^3$

if $|z| \le r$ and

$\beta(z) := \epsilon^{10} + \epsilon^{20} \mathrm{arcsinh}(|z|)$

if $|z| > r$.
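
To get a feeling for the sizes involved, the threshold $r$ and the function $\beta$ can be evaluated numerically. The sketch below finds $r$ by bisection on $[0, 2\epsilon^5]$, which is an assumption that is adequate for small $\epsilon$ (the defining equation has a single sign change there); the names `threshold_r` and `beta` are hypothetical.

```python
import math


def threshold_r(eps):
    """Smallest positive root of r^2 - r^3 = eps^10 + eps^20 * asinh(r), by bisection."""
    def f(r):
        return r * r - r**3 - eps**10 - eps**20 * math.asinh(r)

    lo, hi = 0.0, 2 * eps**5          # f(lo) < 0 < f(hi) for small eps
    for _ in range(200):
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return hi


def beta(z, eps, r):
    """Swapping function: |z|^2 - |z|^3 for |z| <= r, eps^10 + eps^20 asinh|z| otherwise."""
    s = abs(z)
    if s <= r:
        return s * s - s**3
    return eps**10 + eps**20 * math.asinh(s)
```

For example, with the illustrative value `eps = 0.1` one finds `threshold_r(eps)` very close to `eps**5`, and the two branches of `beta` agree at `|z| = r` up to rounding.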

Finding this function was inspired by the discussion of Bellman functions in [12], whence
the letter $\beta$ for this function. For a matrix $G$ of type (10) we will let $\beta(G) = \beta(b)$. By a
discussion as in (16) and (17) it is clear that $\beta(G)$ is comparable to $\log|a|$ with constants
growing polynomially in $\epsilon^{-1}$ and thus growing polynomially in $d$.

Clearly there is not a unique way to choose $\beta$. Our choice reflects in a very explicit way
the two different types of behaviour for small and for large $|z|$ which will be apparent from
the discussion below. Moreover, for each of the two regions our choice shows explicitly the
leading order term ($|z|^2$ and $C^{-1}\mathrm{arcsinh}(|z|)$) and a smaller order correction term which is
used to estimate the nonlinear effects. The third order correction term for small $z$ could
be replaced by any other power $|z|^p$ with $2 < p < 4$.

Given $d$ pairs $(a_i, b_i)$ of complex numbers with $|a_i|^2 = 1 + |b_i|^2$ and a $d$-th root of unity
$\gamma$ (in this section $\gamma$ shall not be a fixed primitive $d$-th root of unity but an arbitrary $d$-th
root of unity) we define

$\begin{pmatrix} A_\gamma & B_\gamma \\ \overline{B_\gamma} & \overline{A_\gamma} \end{pmatrix} := \prod_{i=1}^{d} \begin{pmatrix} a_i & \gamma^i b_i \\ \overline{\gamma^i b_i} & \overline{a_i} \end{pmatrix}$ .

The factors in this product do not commute, hence we emphasize that the product is
understood in ascending order:

$\begin{pmatrix} a_1 & \gamma^1 b_1 \\ \overline{\gamma^1 b_1} & \overline{a_1} \end{pmatrix} \begin{pmatrix} a_2 & \gamma^2 b_2 \\ \overline{\gamma^2 b_2} & \overline{a_2} \end{pmatrix} \cdots \begin{pmatrix} a_d & \gamma^d b_d \\ \overline{\gamma^d b_d} & \overline{a_d} \end{pmatrix}$ . (19)

Also observe that in the last factor we have $\gamma^d = 1$.

Lemma 3.2. Under the above hypotheses, we have

$\sum_\gamma \beta(B_\gamma) \le d \sum_{i=1}^{d} \beta(b_i)$

where the sum on the left hand side runs over all $d$-th roots of unity.

This lemma clearly implies the desired swapping inequality (14).
Proof: We shall first consider the case when $|b_i| \le r$ for all $i$.
Observe that we can write $B_\gamma$ as a polynomial in the (for this matter viewed as independent)
variables $a_i$, $b_i$, $\overline{a_i}$ and $\overline{b_i}$ for $i = 1, \dots, d$.
We claim that this polynomial is odd in the vector

$b = (b_1, \overline{b_1}, \dots, b_d, \overline{b_d})$ .

This follows from the observation that the operation $G \to G^{-*}$ (see (18) and the lines
thereafter) commutes with matrix products. Thus replacing the vector $b$ by its negative
replaces $B_\gamma$ by its negative. Thus $B_\gamma$ has to be an odd polynomial in $b$.

Writing down the matrix product explicitly, we observe that the polynomial $B_\gamma$ is a
sum of monomials of degree $d$, where each such monomial has exactly one entry from
each of the matrices $G_1, \dots, G_d$ as factor. Any choice of one entry from each matrix can
appear in a monomial, provided the following row and column conditions are satisfied:
an entry from the $j$-th row of $G_i$ can appear only if an entry from the $j$-th column of
$G_{i-1}$ appears, the entry from $G_d$ has to be from the second column, and the entry from
$G_1$ has to be from the first row.

This together with oddness in b gives the crude estimate 

i.b 7 i < n n^i + - dr ^ + 2r ) d ~ i - 2dr ■ 

The right hand side is of the order $\epsilon^4$ and thus we are well in the range such that we have
the estimate

$\beta(B_\gamma) \le |B_\gamma|^2 - |B_\gamma|^3$ . (20)

Studying now the polynomial of $B_\gamma$ more carefully, we extract those terms which are
linear in $b$. They are easily seen to be

$\sum_{i=1}^{d} \gamma^i \left(\prod_{j<i} a_j\right) b_i \left(\prod_{j>i} \overline{a_j}\right)$ . (21)

Moreover, since there are no terms quadratic in b, we obtain the estimate 

B -y - ( ) b < ( 1 m] 

*=i Vi<* / \j>i 



< Yi i 6 <n 6 iii 6 *i n \ a i\+\ b i\<zd 3 \b m \\b m ,\ 2 . 

i<j<k lj^i,j,k 

Here $m$ denotes the index such that $|b_m|$ is maximal among all $|b_i|$ and $m'$ denotes the
index such that $|b_{m'}|$ is maximal among all $|b_i|$ with $i \ne m$.

Now we consider the polynomial for $|B_\gamma|^2 = B_\gamma \overline{B_\gamma}$ and sum over all $\gamma$. Observe that
upon the summation in $\gamma$, all terms of the polynomial of $|B_\gamma|^2$ which have a non-trivial
power of $\gamma$ in the coefficient get canceled.

We are again interested in the lowest order terms in $b$, which are the quadratic terms.
Such terms appear when the $i$-th summand of (21) is multiplied by the complex conjugate
of the $j$-th summand in (21). The power of $\gamma$ in such a term is trivial only if $i = j$. Thus
the (in $b$) quadratic terms of $|B_\gamma|^2$ are precisely

$\sum_{i=1}^{d} |b_i|^2 \prod_{j \ne i} |a_j|^2$ .

Moreover, from the previous discussion we can easily see the estimate
(23)



i=i 



Now we use the fact that $|a_i|^2 = 1 + |b_i|^2$ to obtain



J2\B,\ 2 -dY,h 



i=i 



< 6d r\b m .\\b m/ \ 



(24) 



Next, we observe 



3/2 



3/2 



> 



-6d 3 r|6 m ||& m /| 



j=i 



The last estimate followed from (24) and a trivial estimate on the slope of the function
$x \to x^{3/2}$ in the interval $[0,1]$. Now the right hand side of the last display is equal to

\b m \ 3 fl + ^(N/IM) 2 ] -9d 3 r\b m \\b m ,\ 2 

>iM 3 (i + ^E(i^/i 6 -i) 2 ) 

> I 6 *' 3 + l\ b m\\bmi\ 2 ~ 9d 3 r\b m \\b m/ \ 2 > ^ |6i| 3 + ^IMIM 



9d r|&J|& m /| 



Together with (|2~4]) we obtain 

J2\ K /\ 2 - \ B i\ 3 ^ d J2\bi\ 2 - \bi 



Together with (20) this proves Lemma 3.2 in the present case $|b_i| \le r$ for all $i$.

Now we consider the case at the other extreme that for at least two indices $i$ we have
$|b_i| > r$. Denote by $I$ the set of indices $i$ for which $|b_i| > r$.

We observe that the quantity $\mathrm{arcsinh}|b|$ has the meaning of the logarithm of the operator
norm of the matrix

$\begin{pmatrix} a & b \\ \bar b & \bar a \end{pmatrix}$ .

By elementary calculus using $|a|^2 = 1 + |b|^2$ this is equivalent to the statement that the
operator norm of this matrix is $|a| + |b|$. This however has been observed in Section 2.
In particular we obtain for each $\gamma$:

$\mathrm{arcsinh}|B_\gamma| \le \sum_{i=1}^{d} \mathrm{arcsinh}|b_i|$ .

Since by elementary calculus we always have

$\beta(B_\gamma) \le \epsilon^{10} + \epsilon^{20}\mathrm{arcsinh}|B_\gamma|$

we obtain

$\beta(B_\gamma) \le \epsilon^{10} + \epsilon^{20} \sum_{i=1}^{d} \mathrm{arcsinh}|b_i|$
$\le \epsilon^{10}(1 - |I|) + \sum_{i \in I} \beta(b_i) + \epsilon^{20} \sum_{i \notin I} \mathrm{arcsinh}|b_i|$
$\le \sum_{i \in I} \beta(b_i)$ .

Summing over $\gamma$ proves Lemma 3.2 in this case.

The same reasoning as in the previous case can be applied if there is only one index $i$
such that $|b_i| > r$ but there is at least one other index $j$ such that $|b_j| > \epsilon r$. The latter
implies that

$\epsilon^{20}\mathrm{arcsinh}|b_j| \le \beta(b_j) - \epsilon^{15}$ .

Namely, the left hand side is less than $\epsilon^{20}$, while $\beta(b_j)$ is at least $\epsilon^{14}$. Thus we have

$\beta(B_\gamma) \le \beta(b_i) + \beta(b_j) - \epsilon^{15} + \epsilon^{20} \sum_{k \ne i,j} \mathrm{arcsinh}|b_k|$
$\le \beta(b_i) + \beta(b_j)$ .

This proves Lemma 3.2 in the given case.

It remains to prove the case when there is one index $i$ such that $|b_i| > r$ and for all
other indices $j \ne i$ we have $|b_j| \le \epsilon r$. We extract from (22):

7 -^ 7 = h + J2 i j ~ ia ih + Yl i j ~^ b i + E 

j>i j<i 

with 

\E\ < 4d 2 (H + \bi\)\b m ,\ 2 < 8d 2 (l + \bi\)\b m ,\ 2 . 

Here $m'$ is the index such that $|b_{m'}|$ is maximal among all $|b_j|$ with $j \ne i$. Observe that
under the given assumptions the term $b_i$ is large compared to the linear terms in $b_j$, $j \ne i$,
which in turn are large compared to $E$. Indeed, we observe the estimate



< 2d(l + |& i |)|6 n 



j<i j>i 

Our goal is to make a Taylor expansion of the function $f : z \to \mathrm{arcsinh}|z|$ near the
point $b_i$. Let $\lambda$ denote the linear form which is the derivative of $f$ at $b_i$ and let $\rho$ denote
the quadratic form which is the second derivative of $f$ at an appropriate point within
distance $2d(1 + |b_i|)|b_{m'}|$ of $b_i$.

Then we obtain from Taylor's theorem

arcsinh | B 1 \ = arcsinh 1 7 ~ 1 B 1 \ 

7 7 

= d arcsinhl&i | + ^ 7 i_i a i 6 i + ^ 7 i_i a7^ + E) + F 

7 3<i 3>i 
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with 

\f\ <j2\p\\J2^~ ta ^ + J2^~ l7F ^ +E \ 2 ■ 

7 3<i j>i 

Using that 

7 

for all j ' i we obtain 

arcsinh |i? 7 | = d arcsinh | b{ | + y^X(E) + F . 

7 7 

From elementary calculus we obtain 

|A|<(i + NT 1/2 

|p| <20|6 i |- 1 (l + l^| 2 )" 1/2 • 

Hence 

arcsinh | B 1 \ < c?arcsinh|6j| + 50r _1 <i 3 |& m | 2 

7 

and 

^2f3(B 7 ) < d(3(b { ) + e 20 50r- 1 rf 3 |6 m | 2 < d^[3(b 3 ) . 

7 j 

This proves Lemma 3.2 in the last case and thus completes the proof.

■ 

We close this section by showing that for $d = 3$ we cannot choose $\beta(G)$ to be $\log\|G\|_{HS}$
if we want the swapping inequality (14) to hold. The rest of this section is irrelevant for
the purpose of proving Theorem 1.4.

Let $\gamma$ denote a third root of unity. Define

$\begin{pmatrix} a & b \\ c & d \end{pmatrix}^{\gamma} := \begin{pmatrix} a & \gamma b \\ \gamma^2 c & d \end{pmatrix}$ .

We aim to find matrices $A, B, C$ in $SU(1,1)$ such that

$\prod_\gamma \|A^{\gamma^2} B^{\gamma} C\|_{HS} > \|A\|_{HS}^3 \|B\|_{HS}^3 \|C\|_{HS}^3$ (25)

where the product on the left hand side goes over all three third roots of unity. We shall
present such matrices $A, B, C$ with real entries. Thus we write

$A = \begin{pmatrix} a & b \\ b & a \end{pmatrix}$ , $B = \begin{pmatrix} c & d \\ d & c \end{pmatrix}$ , $C = \begin{pmatrix} e & f \\ f & e \end{pmatrix}$ .

By homogeneity of (25) the requirement that $A, B, C$ are in $SU(1,1)$ can be relaxed to
the requirement that they have nonzero determinant and $|a| > |b|$, $|c| > |d|$, and $|e| > |f|$.
Indeed we will produce an example satisfying the latter constraints and

$a^2 + b^2 = 1$, $c^2 + d^2 = 1$, $e^2 + f^2 = 1$ .

In particular the right hand side of (25) is equal to 1.
The matrix $A^{\gamma^2} B^{\gamma} C$ is equal to

$\begin{pmatrix} ace + \gamma adf + \gamma bde + \gamma^2 bcf & * \\ \gamma bce + \gamma^2 bdf + \gamma^2 ade + acf & * \end{pmatrix}$

where the unspecified entries in the second column are the same as the diagonally opposite
terms.

We calculate the Hilbert Schmidt norm squared of this matrix, which is the sum of
the squared moduli of the two indicated entries. Observe that squaring the entries and
multiplying out gives pure squares (the modulus square of a summand) and mixed terms
(product of two different summands with proper complex conjugation). The pure squares
simply add up to

$(a^2 + b^2)(c^2 + d^2)(e^2 + f^2) = 1$ .

To calculate the mixed terms it helps to observe that we may divide the second entry by
$\gamma$, then the two entries are alike but with $a$ and $b$ interchanged. Using $a^2 + b^2 = 1$ and
$\bar\gamma = \gamma^2$ we obtain for the mixed terms

$+ cdef\gamma^2 + 2abcde^2\gamma^2 + 2abc^2ef\gamma + 2abd^2ef + 2abcdf^2\gamma^2 + cdef\gamma^2$
$+ cdef\gamma + 2abcde^2\gamma + 2abc^2ef\gamma^2 + 2abd^2ef + 2abcdf^2\gamma + cdef\gamma$ .

Now we set a = ab and (3 = ef. Thus it will suffice to produce a,j9 £ [0, 1/2]. Using 
e 2 + f 2 = 1 we obtain for the square of the Hilbert Schmidt norm 

(1 + Aa(3d 2 ) + (2a(3c 2 + 2acd + 2f3cd)^ + {2a(3c 2 + 2acd + 2f3cd)rf . 

Now we observe for any three numbers $K, L, M$ the formula

\[ \prod_{\gamma} (K + L\gamma + M\gamma^2) = K^3 + L^3 + M^3 - 3KLM . \]

Namely, expanding the left hand side, clearly the coefficients in front of $K^3$, $K^2L$, and
$K^2M$ are 1, 0, 0; the latter two because the sum of the third roots of unity is 0. By
multiplying each factor on the left hand side by $\gamma$ ($\gamma^2$) we see that the left hand side is
invariant under cyclic permutations of $K, L, M$. Thus it remains to check that the factor
$-3$ in front of $KLM$ is correct. This however follows from letting $K = L = M = 1$.
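The product formula can also be checked numerically. The following is a minimal sketch and not part of the original argument; the helper names `root_product` and `closed_form` are hypothetical and introduced only for this illustration.

```python
import cmath
import math

# The three third roots of unity.
ROOTS = [cmath.exp(2j * math.pi * n / 3) for n in range(3)]


def root_product(K, L, M):
    """Product over all third roots of unity g of K + L*g + M*g**2."""
    prod = 1
    for g in ROOTS:
        prod *= K + L * g + M * g * g
    return prod


def closed_form(K, L, M):
    """Right hand side K^3 + L^3 + M^3 - 3KLM of the formula."""
    return K**3 + L**3 + M**3 - 3 * K * L * M
```

For real inputs the product is real up to rounding, so the two functions should agree up to a small floating point error.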
Now fix $\beta > 0$ and choose $\alpha$ and $d$ very small but nonzero such that

\[ 2\alpha\beta c^2 + 2\beta cd = 0 . \]

Let $K = 1 + 2\alpha\beta d^2 + 2\alpha\beta d^2$, $L = 2\alpha cd$, $M = 2\alpha cd$. Then $L^3$, $M^3$, and $3KLM$ are small
of order at least $\alpha^2 d^2$. However,

\[ K^3 = 1 + 12\alpha\beta d^2 + O(\alpha^2 d^2) . \]

Thus the left hand side of (25) can be made bigger than 1.
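The construction can be tested on a concrete example. The sketch below is an illustration under stated assumptions, not the authors' computation: it parametrizes $a = \cos\phi$, $b = \sin\phi$, $e = \cos\theta$, $f = \sin\theta$, picks $c, d$ on the unit circle with $d = -\alpha c$, and multiplies the squared norms (sum of squared moduli of the first column) of $A^{\gamma^2}B^{\gamma}C$ over the three roots. The names `twist`, `matmul`, `norm_sq` and `product_of_norm_squares` are hypothetical helpers.

```python
import cmath
import math

# The three third roots of unity.
ROOTS = [cmath.exp(2j * math.pi * n / 3) for n in range(3)]


def twist(m, g):
    """Twisted matrix (p, g*q; conj(g)*r, s); conj(g) = g**2 for a third root of unity."""
    (p, q), (r, s) = m
    return ((p, g * q), (g.conjugate() * r, s))


def matmul(x, y):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(x[i][0] * y[0][j] + x[i][1] * y[1][j] for j in range(2))
        for i in range(2)
    )


def norm_sq(m):
    """Sum of the squared moduli of the two first-column entries."""
    return abs(m[0][0]) ** 2 + abs(m[1][0]) ** 2


def product_of_norm_squares(phi, theta):
    """Product over the three roots of the squared norm of A^(g^2) B^g C."""
    alpha = math.sin(2 * phi) / 2      # alpha = a*b
    psi = -math.atan(alpha)            # forces d = -alpha*c
    a, b = math.cos(phi), math.sin(phi)
    c, d = math.cos(psi), math.sin(psi)
    e, f = math.cos(theta), math.sin(theta)
    A, B, C = ((a, b), (b, a)), ((c, d), (d, c)), ((e, f), (f, e))
    result = 1.0
    for g in ROOTS:
        m = matmul(matmul(twist(A, g * g), twist(B, g)), C)
        result *= norm_sq(m)
    return result
```

With small `phi` and, say, `theta = math.pi / 8`, the returned product should exceed 1 by roughly $12\alpha\beta d^2$, while `phi = 0` gives exactly the normalized value 1 up to rounding.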



4. Proof of inequality (||) 

This section is very close to the known existing proofs of Carleson's theorem in the 
classical linear case. We follow closely fCB] . For example Corollary fO] corresponds to 
a Bessel inequality in the linear case. In the current non-linear setting it is convenient 
to estimate the contribution of a single tree pointwise outside an exceptional set (in the 
spirit of the original proof by Carleson H) instead of using any $L^p$ estimate, because of
the ease of pointwise summing a geometrically decaying sequence using a quasi triangle
inequality, see the calculation beginning with (37).
We shall first assume that F is compactly supported. 

We are interested in the dependence of constants on $d$. It will help to introduce a
constant $\Gamma$ which (unlike the constant $C$) does not change from line to line and has
polynomial growth in $d$. The constants $C$ in this section will be independent of $d$.

If $G$ is a matrix in $SU(1,1)$ and $a$ is its first entry, we shall write

\[ |G| := |a| . \]

Choose $\Gamma$ so that

\[ \Gamma^{-1} \log|G| \le \beta(G) \le \Gamma \log|G| . \]

By construction of $\beta$ this constant grows polynomially in $d$.

Orthogonality of disjoint tiles. We define a partial ordering on tiles by $p \le p'$ if $I \subset I'$
and $\omega' \subset \omega$. Recall that all intervals $I$ and $\omega$ are $d$-adic and assumed to be half open
(containing the left but not the right endpoint). Therefore two such intervals are either
disjoint or one is contained in the other. Since tiles have area one we conclude that two
tiles are comparable if and only if they have non-empty intersection.
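The nesting property of half open $d$-adic intervals can be illustrated by a brute force check. This is a minimal sketch; the representation of an interval by a pair `(level, index)` and the helper names `dadic` and `nested_or_disjoint` are assumptions made for the illustration.

```python
from fractions import Fraction


def dadic(d, level, index):
    """Half open d-adic interval [index * d**level, (index + 1) * d**level)."""
    length = Fraction(d) ** level
    return (index * length, (index + 1) * length)


def nested_or_disjoint(i, j):
    """True if the half open intervals i and j are disjoint or one contains the other."""
    (a, b), (c, e) = i, j
    disjoint = b <= c or e <= a
    nested = (a <= c and e <= b) or (c <= a and b <= e)
    return disjoint or nested
```

Exact rational arithmetic is used so that endpoints of intervals on different scales are compared without rounding.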



We observe that we have the following corollary of Lemma 3.2 



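Corollary 4.1. Let $\mathbf{q}$ be a finite set of pairwise disjoint tiles and let $\mathbf{p}$ be a finite set of
tiles such that for all $q \in \mathbf{q}$ we have

\[ q \subset \bigcup_{p \in \mathbf{p}} p \]

and for all $q \in \mathbf{q}$ and $p \in \mathbf{p}$ we have $q \le p$ whenever $p$ and $q$ have nonempty intersection.
Then

\[ \sum_{q \in \mathbf{q}} \beta(G_q) \le \sum_{p \in \mathbf{p}} \beta(G_p) . \tag{26} \]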

Proof: 

If $\mathbf{p}$ has none or one element, then (26) is trivial because $\mathbf{q}$ has to be a subset of $\mathbf{p}$. Fix
$\mathbf{p}$; by induction we may assume the corollary has been proved for all subsets of $\mathbf{p}$. Now
choose $\mathbf{q}$. By cancelling equal summands on both sides of (26) and using the result for
subsets of $\mathbf{p}$ we may assume that $\mathbf{q}$ and $\mathbf{p}$ are disjoint. We may assume $\mathbf{q}$ is nonempty
and choose $q$ such that $l = |I_q|$ is minimal. Since the possible values of $l$ are discrete
and bounded above by $\sum_{p \in \mathbf{p}} |I_p|$, we may assume by induction that the statement of
the corollary is true for all values of $l$ larger than a given $l_0$, and we have to prove the
statement under the assumption $l = l_0$. Now we use induction on the number $n$ of tiles
$q$ in $\mathbf{q}$ which satisfy $|I_q| = l$. Again by induction we may fix an $n_0$ and assume that the
statement is true for all $n < n_0$ and we have to prove the statement assuming $n = n_0$.

Now pick a tile $q \in \mathbf{q}$ such that $|I_q| = l$. It is the vertical tile of a multitile $Q$. We claim

1. $Q \subset \bigcup_{p \in \mathbf{p}} p$.

2. Any vertical tile in $Q$ is either an element of $\mathbf{q}$ or it is disjoint from all tiles in $\mathbf{q}$.

Assuming these two claims for now, we observe that it suffices to prove the statement of
the corollary for $\mathbf{q}'$ which is the union of $\mathbf{q}$ and the set of vertical tiles in $Q$. Observe
that $l' = l$ and $n' \le n + d - 1$ where $l'$ and $n'$ are defined analogously to $l$ and $n$. By the
swapping inequality (|i~4|) it suffices to prove the statement for $\mathbf{q}''$ which is equal to $\mathbf{q}'$ with
all vertical tiles of $Q$ removed and all horizontal tiles of $Q$ added in. Observe that $l'' \ge l$
and, if $l = l''$, then $n'' < n$. Thus the statement of the corollary follows by induction.

It remains to prove the above two claims. To see the first claim, pick $(k, x) \in Q$. There
is a $(k_0, x) \in q$. Then there is a $p \in \mathbf{p}$ with $(k_0, x) \in p$. Since $p \cap q \ne \emptyset$ we have by
assumptions on $\mathbf{p}$ that $q \le p$. Thus $I_p$ is a $d$-adic interval, strictly containing $I_q$ because
of $p \ne q$. By $d$-adicity $I_Q \subset I_p$ and hence $(k, x) \in p$, which had to be proved.

To see the second claim pick a vertical tile $q'$ of $Q$ and assume that $q' \cap q'' \ne \emptyset$ for some
$q'' \in \mathbf{q}$. By minimality of the choice of $I_q$ we have $I_{q'} \subset I_{q''}$. If this inclusion was strict,
then $q'' \cap q \ne \emptyset$, which is impossible. Hence $I_{q'} = I_{q''}$ and hence $q'' = q'$. This proves the
second claim and completes the proof of the corollary.



Corollary 4.2. Let q be a set of pairwise disjoint tiles. Then 



Proof: This follows from the previous corollary by a limiting argument as in the proof 



Selecting trees. We define an ordering on multitiles analogous to the ordering on tiles.
Thus $P \le P'$ for two multitiles $P = I \times \omega$ and $P' = I' \times \omega'$ if $I \subset I'$ and $\omega' \subset \omega$.

A set $\mathbf{P}$ of multitiles is called convex, if for any three multitiles $P \le P' \le P''$ with
$P, P'' \in \mathbf{P}$ we can conclude $P' \in \mathbf{P}$.

An ordered splitting of a set P of multitiles is a decomposition of P into a disjoint union 



where the index set is a subset of the integers (and possibly $\infty$) and $P \in \mathbf{P}_n$, $P' \in \mathbf{P}_{n'}$ with $P \le P'$
imply $n \le n'$. Observe that if $\mathbf{P}$ is convex, then the components $\mathbf{P}_n$ of an ordered splitting
are again convex.

A tree is a set $T$ of multitiles which has a maximal element with respect to the ordering
of multitiles. This maximal element is called the top of the tree and denoted by $P_T$.

Each element $P$ of a tree other than the top itself has a distinguished index $j_P \in
\{0, \dots, d-1\}$ attached to it such that $p_{j_P}$ is the unique horizontal subtile of $P$ which
intersects the tree top. For the top $P_T$ of a tree we define $j_{P_T} = d - 1$. Observe that we
have suppressed the dependence of $j_P$ on the given tree in the notation.

We define the size of a collection of multitiles by 




of ©. 





(27) 



where the sup is taken over all trees in $\mathbf{P}$. For a given tree $T$ the tiles $p_j$ occurring in the
sum on the right hand side of (27) are pairwise disjoint.
The above Corollary 4.2 implies the following lemma:
Lemma 4.3. Let $\mathbf{P}$ be a convex set of multitiles. Then we can decompose $\mathbf{P}$ into an
ordered splitting $\mathbf{P}_1 \cup \mathbf{P}_2$ such that

\[ \mathrm{size}(\mathbf{P}_2) \le 2^{-4}\, \mathrm{size}(\mathbf{P}) \tag{28} \]

and $\mathbf{P}_1$ is the (not necessarily disjoint) union of a collection $\mathbf{T}$ of trees such that

\[ \sum_{T \in \mathbf{T}} |I_T| \le C\Gamma\, \mathrm{size}(\mathbf{P})^{-1} \|F\|_2^2 \tag{29} \]

and each tree $T \in \mathbf{T}$ with top $P_T$ has the saturation property that if $P \le P_T$ for some
$P \in \mathbf{P}_1$ then $P \in T$.

Proof: Set $a = 2^{-4}\, \mathrm{size}(\mathbf{P})$.

We select recursively for $n = 1, 2, 3, \dots$ a tree $T_n$. Suppose we have already chosen $T_m$
for all $m < n$. If there is a tree $T_n$ in

\[ \mathbf{P}_n := \mathbf{P} \setminus \bigcup_{m<n} T_m \]

with size larger than $a$, then we choose one such tree with top $P_n$ say such that the upper
endpoint of $\omega_{P_n}$ is minimal. The tree $T_n$ is then the maximal tree in $\mathbf{P}$ with respect to
set inclusion with top $P_n$.

We iterate this tree selection until we reach an $n = N$ such that there is no tree in $\mathbf{P}_n$
with size larger than $a$. If this is the case, we stop the selection and define $\mathbf{P}_1$ to be the
union of trees selected. Define $\mathbf{P}_2 = \mathbf{P} \setminus \mathbf{P}_1$. By maximality of each selected tree it is
clear that the splitting of $\mathbf{P}$ into $\mathbf{P}_1$ and $\mathbf{P}_2$ is ordered and that each selected tree satisfies
the saturation property of the lemma. Moreover, by the stopping condition for the tree
selection it is clear that $\mathbf{P}_2$ satisfies the size estimate (28).

It remains to prove the bound (|2"9"|). By Corollary [4. 1| it suffices to show that the set 
of tiles pj with P e T n for some 1 < n < N and j < jp is a set of pairwise disjoint 
tiles. Suppose to get a contradiction that pj < p'j, for two distinct such tiles. Then P 

belongs to a tree T n and P' belongs to a tree T n >. By d-adicity it is easy to see that the 
upper endpoint of u)p n is greater than the upper endpoint of Wp„ in particular n 7^ n' and 
n < n' . But the geometry of p n > qualifies it to be in the tree T n , which is a contradiction 
to the maximality of T n . 
This proves Lemma 4.3.



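By iterating this lemma we obtain:
Corollary 4.4. If $\mathbf{P}$ is any finite set of tiles, we can decompose it into an ordered splitting

\[ \mathbf{P} = \mathbf{P}_\infty \cup \bigcup_{\mathbf{k} \in \mathbb{Z}} \mathbf{P}_{\mathbf{k}} \]

such that

\[ \mathrm{size}(\mathbf{P}_{\mathbf{k}}) \le 2^{-4\mathbf{k}} \]

and $\mathbf{P}_{\mathbf{k}}$ is the union of a collection $\mathbf{T}_{\mathbf{k}}$ of trees such that

\[ \sum_{T \in \mathbf{T}_{\mathbf{k}}} |I_T| \le C\Gamma 2^{4\mathbf{k}} \|F\|_2^2 \]

and each tree $T \in \mathbf{T}_{\mathbf{k}}$ with top $P$ contains all elements $P' \in \mathbf{P}_{\mathbf{k}}$ with $P' \le P$. Moreover,
$\mathrm{size}(\mathbf{P}_\infty) = 0$. If the set $\mathbf{P}$ is convex, then all trees in $\mathbf{T}_{\mathbf{k}}$ are convex.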

A John-Nirenberg type estimate for a single tree. Given a convex tree $T$, we
shall be concerned with the following function defined on $I_T$:

\[ M_T(k) = \sup_{k \in I,\ I':\ I \subset I' \subset I_T} \log \Big| \prod_{P \in T:\ I \subset I_P \subset I'}\ \prod_{j < j_P} G_{p_j}(k) \Big| . \tag{30} \]

Here the product is to be understood in the natural order of descending size of $I_P$ and
descending $j$: If $|I_P| < |I_{P'}|$, and $j = j_P > 0$, $j' = j_{P'} > 0$, then the corresponding factors
appear in the order

\[ \cdots G_{p'_{j'-1}} G_{p'_{j'-2}} \cdots G_{p'_0} \cdots G_{p_{j-1}} G_{p_{j-2}} \cdots G_{p_0} \cdots . \]

[Figure: multitiles $P''$, $P'$, $P$ of a tree, with the horizontal subtiles $p'_1$, $p'_0$ and $p_0$ marked.]

We have the John-Nirenberg type lemma:
Lemma 4.5. Let $T$ be a convex tree. Then for every integer $\mu > 0$ we have

\[ |\{k \in I_T : M_T(k) > 4d\Gamma 2^{2\mu}\, \mathrm{size}(T)\}| \le 2^{-c\mu^2} |I_T| \]

for some small universal constant $c$.

Remark: This inequality gives less decay in $\mu$ on the right hand side than the usual
John-Nirenberg inequality. The loss is due to our approach to dealing with the quasi
triangle inequality in (31) instead of a triangle inequality.

Proof:

Observe that it suffices to prove the Lemma for large $\mu$.

Let $P_T$ be the top of the tree and $\xi$ be the lower endpoint of the interval $\omega_{P_T}$. Observe
that all intervals $\omega_{p_j}$ for $p_j$ appearing in the product in (30) lie below $\xi$. Therefore we do
not change the value of $M_T$ if we restrict $F$ to $[0, \xi]$. Therefore we shall assume for the
purpose of proving this lemma that $F$ is supported in $[0, \xi]$.

It suffices to prove a similar estimate for the simpler variant

\[ \tilde M_T(k) = \sup_{I:\ k \in I} \log \Big| \prod_{P \in T:\ I \subset I_P \subset I_T}\ \prod_{j < j_P} G_{p_j}(k) \Big| . \]

This follows from writing

\[ \prod_{P \in T:\ I \subset I_P \subset I'}\ \prod_{j < j_P} G_{p_j}(k) = \Big( \prod_{P \in T:\ I' \subsetneq I_P \subset I_T}\ \prod_{j < j_P} G_{p_j}(k) \Big)^{-1} \Big( \prod_{P \in T:\ I \subset I_P \subset I_T}\ \prod_{j < j_P} G_{p_j}(k) \Big) \]

and estimating both factors on the right hand side by $\tilde M_T$. Namely, observe that for all
$G, G' \in SU(1,1)$ we have

\[ |G| = |G^{-1}| \]

and the quasi triangle inequality

\[ \log|GG'| \le 2\log|G| + 2\log|G'| . \tag{31} \]

The latter follows from Lemma 37T . Thus we obtain the pointwise estimate

\[ M_T(k) \le 4 \tilde M_T(k) , \]

which reduces the matter to estimating $\tilde M_T$.
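The quasi triangle inequality (31) can be probed numerically on sample elements of $SU(1,1)$. The sketch below is only an illustration; it stores a matrix $\begin{pmatrix} a & b \\ \bar b & \bar a \end{pmatrix}$ as the pair `(a, b)`, and the parametrization by `t, u, v` as well as the helper names are assumptions of this example.

```python
import cmath
import math


def su11(t, u, v):
    """First row (a, b) of an SU(1,1) matrix with |a| = cosh(t), |b| = sinh(t)."""
    a = math.cosh(t) * cmath.exp(1j * u)
    b = math.sinh(t) * cmath.exp(1j * v)
    return (a, b)


def multiply(g, h):
    """First row of the product of two SU(1,1) matrices stored as (a, b)."""
    (a, b), (c, e) = g, h
    return (a * c + b * e.conjugate(), a * e + b * c.conjugate())


def log_abs(g):
    """log|G|, where |G| is the modulus of the first entry."""
    return math.log(abs(g[0]))


def quasi_triangle_gap(g, h):
    """Right hand side minus left hand side of the quasi triangle inequality."""
    return 2 * log_abs(g) + 2 * log_abs(h) - log_abs(multiply(g, h))
```

A nonnegative gap on every sample is consistent with (31); it is of course not a proof.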
We first prove for $\lambda > 1$ the following estimate:

\[ |\{k : \tilde M_T(k) > \Gamma\lambda\, \mathrm{size}(T)\}| \le \lambda^{-1} |I_T| . \tag{32} \]

This follows from the estimate

\[ |\{k : \tilde M_T^{\beta}(k) > \lambda\, \mathrm{size}(T)\}| \le \lambda^{-1} |I_T| \tag{33} \]

for the modified function

\[ \tilde M_T^{\beta}(k) = \sup_{I:\ k \in I} \beta\Big( \prod_{P \in T:\ I \subset I_P,\ P \ne P_T}\ \prod_{j < j_P} G_{p_j}(k) \Big) \]

because $\beta(G)$ and $\log|a|$ are comparable by a factor of $\Gamma$.

Let $P$ be a multitile of the tree $T$, let $q$ be a vertical tile in $P$, and assume $k \in I_q$. Then
we have by support assumption on $F$, convexity of the tree, and Lemma 2.1:

\[ G_q(k) = \prod_{P' \in T:\ I_P \subset I_{P'},\ P' \ne P_T}\ \prod_{j < j_{P'}} G_{p'_j}(k) . \]

Namely, it is an elementary geometric observation that the intervals $\omega_{p'_j}$ on the right form
a partition of the interval $\omega_q \cap (-\infty, \xi)$.

Let $\mathbf{q}$ be the set of maximal tiles in the set of all tiles $q$ which are vertical tile of some
$P \in T$ and which satisfy

\[ \beta(G_q) > \lambda\, \mathrm{size}(T) . \]

Observe that the set estimated on the left hand side of (33) is contained in the union of
$I_q$ with $q \in \mathbf{q}$. Furthermore observe that the union of all $q \in \mathbf{q}$ is covered by the union
of all $P \in T$. Hence it is covered by the top multitile $P_T$ and all horizontal tiles $p_j$ of
multitiles $P \in T$ and $j \ne j_P$.
An application of Lemma p.l| gives 

<?eq PeT,p^p T j<j P 

Here we have used that on the right hand side we do not have to include the terms with 
P = Pt or j > jp because they give zero contribution thanks to support assumption on 
F. 

This implies 

which proves (|33|) and therefore also (|32l) . 

Now we bootstrap (32) to the desired estimate for $\tilde M_T$. This step is analogous to the
bootstrapping argument that can be used to prove the usual John-Nirenberg inequality,
which is why we say Lemma 4.5 is of John-Nirenberg type. It suffices to prove:

\[ |\{k : \tilde M_T(k) > 2^{\mu+2} d\Gamma\, \mathrm{size}(T)\}| \le 2^{-\mu} |\{k : \tilde M_T(k) > 2^{\mu} d\Gamma\, \mathrm{size}(T)\}| . \tag{34} \]

Namely, given this estimate, we have by iteration for every integer $\mu > 0$

\[ |\{k : \tilde M_T(k) > 2^{2\mu} d\Gamma\, \mathrm{size}(T)\}| \le 2^{-c\mu^2} |\{k : \tilde M_T(k) > d\Gamma\, \mathrm{size}(T)\}| \le 2^{-c\mu^2} |I_T| . \]

This will prove the desired estimate. We prove (34).

Let $\mathbf{P}$ be the set of maximal multitiles $P \in T$ such that

\[ \max(\log|G_{q_1}|, \dots, \log|G_{q_d}|) > 2^{\mu} d\Gamma\, \mathrm{size}(T) \]

where $q_1, \dots, q_d$ are the vertical subtiles of $P$. For each such $P$ let $P' \in T$ be the minimal
multitile in $T$ which is larger than $P$. (The proof trivializes if there is no such multitile
because then $P = P_T$.) Assume that $I_P$ is equal to $I_{q'_j}$.

log|GVJ < 2^rsize(T) 

and moreover, for every k 6 Ip, 



sup log 

I:IpCl 



n II G p'^ < 2^rsize(T) . 

P'eT:ICl pl ,I pl ^I T j<j pl 

Observing that every subtree of T has size at most size(T) and applying (|32|) to the 
subtree of all P" G T with Ipu C Ip and using the quasi triangle inequality (0) we 
obtain: 

\{k G I P : M T (k) > 2(T + 2^)rfrsize(T)}| < 2-"dr 1 \Ip\ . 

Since the intersection of \Ip\ and the set where Mk > <i M size(T) has measure at least 
d _1 |/p| (this argument is one of two places in the proof of @ where we lose a power of d 
in the dependence on d other than the loss due to T) we obtain: 

\{k G I T : M T {k) > 2" +2 dr S ize(T)}| < 2 ^ | { A: G I T : M fc > 2 M rfrsize(T)}| . 



This proves (34) and completes the proof of Lemma 4.5.




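The Carleson theorem. By restricting the set of tiles to those inside a large square
$[0, d^K) \times [0, d^K)$ and a subsequent limiting argument as in Section || we may assume the
set $\mathbf{P}$ of all multitiles is finite.

We are aiming to show that there exists a $C$ such that for each $\lambda > 0$ we have

\[ |\{k : \sup_x \log|G(k,x)| > Cd\lambda^{-1}\}| \le \Gamma^2 \|F\|_2^2 \lambda . \]

Fix $\lambda$.

Decompose $\mathbf{P}$ into $\mathbf{P}_{\mathbf{k}}$ according to Corollary 4.4, where

\[ \Gamma\, \mathrm{size}(\mathbf{P}_{\mathbf{k}}) \le 2^{-4\mathbf{k}} \]

and $\mathbf{P}_{\mathbf{k}}$ is the union of a collection $\mathbf{T}_{\mathbf{k}}$ of trees such that

\[ \sum_{T \in \mathbf{T}_{\mathbf{k}}} |I_T| \le C\Gamma^2 2^{4\mathbf{k}} \|F\|_2^2 . \]

We define an exceptional set $E = \bigcup_{\mathbf{k}} E_{\mathbf{k}}$.

Let $K$ be a negative integer of large modulus to be determined later. For $\mathbf{k} < K$ we
define

\[ E_{\mathbf{k}} = \bigcup_{T \in \mathbf{T}_{\mathbf{k}}} I_T \]

and obtain

\[ |E_{\mathbf{k}}| \le C\Gamma^2 2^{4\mathbf{k}} \|F\|_2^2 . \]

For $\mathbf{k} \ge K$ we define

\[ E_{\mathbf{k}} = \bigcup_{T \in \mathbf{T}_{\mathbf{k}}} \{k \in I_T : M_T(k) > 4d\, 2^{2(\mathbf{k}-K)} 2^{-4\mathbf{k}}\} . \]

By the John-Nirenberg type Lemma we have for $T \in \mathbf{T}_{\mathbf{k}}$

\[ |\{k \in I_T : M_T(k) > 4d\, 2^{2(\mathbf{k}-K)} 2^{-4\mathbf{k}}\}| \le 2^{-c(\mathbf{k}-K)^2} |I_T| . \]

Thus

\[ |E_{\mathbf{k}}| \le C\Gamma^2 2^{-c(\mathbf{k}-K)^2} 2^{4\mathbf{k}} \|F\|_2^2 = C\Gamma^2 2^{-c(\mathbf{k}-K)^2} 2^{4(\mathbf{k}-K)} 2^{4K} \|F\|_2^2 . \]

If we choose $K$ maximal with $\lambda \ge C 2^{4K}$ for a certain $C$, then

\[ |E| \le \sum_{\mathbf{k} \in \mathbb{Z}} |E_{\mathbf{k}}| \le \Gamma^2 \lambda \|F\|_2^2 . \]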



kez 

It remains to prove that for $k \notin E$ and every $x$ we have

\[ \log|G(k,x)| \le Cd\lambda^{-1} \tag{35} \]

for some constant $C$. Fix $x$. We can write

\[ G(k,x) = \prod_{P \in T}\ \prod_{j < j_P} G_{p_j}(k,x) \]

where $T$ is the convex tree of all multitiles $P$ such that $k \in I_P$ and $x \in \omega_P$ and $j_P$ for
$P \in T$ is the unique index such that $x \in \omega_{p_{j_P}}$.

Let $T_{\mathbf{k}}$ be the intersection of $T$ with $\mathbf{P}_{\mathbf{k}}$. Since the sets $\mathbf{P}_{\mathbf{k}}$ form an ordered splitting,
the sets $T_{\mathbf{k}}$ are convex trees. Moreover, each $T_{\mathbf{k}}$ is contained in a tree $\tilde T_{\mathbf{k}}$ of $\mathbf{T}_{\mathbf{k}}$ by the
saturation property of the trees in $\mathbf{T}_{\mathbf{k}}$.
Denote the top of $\tilde T_{\mathbf{k}}$ by $P_{\mathbf{k}}$. Observe that if $P \in T_{\mathbf{k}}$ and $P \ne P_{\mathbf{k}}$, then the number $j_P$
defined with respect to $T$ is the same as the one defined with respect to $\tilde T_{\mathbf{k}}$.



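Hence we can estimate the contribution of the tree $T_{\mathbf{k}}$ using Lemma 4.5 and the quasi
triangle inequality as follows:

\[ \log \Big| \prod_{P \in T_{\mathbf{k}}}\ \prod_{j < j_P} G_{p_j}(k,x) \Big| \le 2 \log \Big| \prod_{j < j_{P_{\mathbf{k}}}} G_{p_j}(k,x) \Big| + 2 \log \Big| \prod_{P \in T_{\mathbf{k}},\ P \ne P_{\mathbf{k}}}\ \prod_{j < j_P} G_{p_j}(k,x) \Big| \]

\[ \le 2 \log \Big| \prod_{j < j_{P_{\mathbf{k}}}} G_{p_j}(k,x) \Big| + 2 M_{\tilde T_{\mathbf{k}}}(k) . \tag{36} \]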



Here $p_j$ in the first summand of (36) and the preceding line is a horizontal tile of $P_{\mathbf{k}}$. To
estimate the first term we use that $\log|G|$ is comparable to $\beta(G)$ and the swapping
inequality to obtain



x 



io § n 

i<jp k 

<mU G^(k,x)) 
j<jp k 

<dvJ2 P(G„ Pj )<dr 

3<3P^ 



lk 



The last inequality follows by observing that {Pk} constitutes a tree by itself which is 
controlled in size because P G Pk- Observe that in this argument we lose a factor d. 
Thus, by choice of x. 



log 



n 

P&T k j<j P 



X 



< cdT2 2 ^- K h- 4k . 



The trees $T_{\mathbf{k}}$ with $\mathbf{k} < K$ are empty, because $k$ is not in the exceptional set. Moreover,
by finiteness assumption on the set $\mathbf{P}$ there is a $K'$ so that $T_{\mathbf{k}}$ is empty for $\mathbf{k} > K'$ and
$\mathbf{k} \ne \infty$. (The constant $C$ is not allowed to depend on $K'$.)

Then we have 



G(k,x) 



n u G ^ x n n n n^^^) 

\PeTcvjKjp J \K<k<K' PeT k j<j P 

where as usual the product has to be read in the correct order. The factor coming from 
Tqo can be discarded since it gives a unitary matrix. 
By the quasi triangle inequality we have 

log|G(Jfe,a;)| < ACd2~ AK . 
Namely, we can prove inductively 

io §i n n ^ 

K"<k<K' P&T k j<j P 

<2io g | n iiG^k, x)\+2\o g \ n nn^Mi 

PeT K „ i<j P Jsr»+i<k<if' PeT k i<j P 

< 2Cd2 2 ^"^)2- 4 ^" + 1Cd1^ K "- K H^ K " 
< Cd2 2 « K "-V- K »2- 4 ( K "~V ; 

where, contrary to our standing convention, the constant $C$ for induction purposes is the
same in all appearances in this calculation.

This proves inequality (35) and therefore completes the proof of inequality (§) in the
case of compactly supported $F$.

It remains to discuss the case of not necessarily compactly supported $F \in L^2(\mathbb{R})$. We
first prove that the limit (^) exists almost everywhere. It suffices to fix small $\epsilon$ and prove
that the limit exists for all $k$ outside a set of measure $\epsilon$. This can be done by decomposing
the positive real axis into intervals $[0, x_1)$, $[x_1, x_2)$, etc., such that the $L^2$ norms of the
restrictions of $F$ to the intervals $\omega_j = [x_j, x_{j+1})$ decay very rapidly in $j$. This implies by
the Plancherel inequality (§) that for $k$ outside a small set (of size $\epsilon/2$) the values $\log|G_{\omega_j}|$
are still very rapidly decaying, so that one can use the triangle inequality

\[ \log|G_1 G_2| \le \log|G_1| + \log\|G_2\|_{op} \tag{38} \]

to show that the sequence $\log|G_{[0,x_j)}|$ is a Cauchy sequence. Using (9) for the restriction
of $F$ to each interval $[x_j, x_{j+1})$ one can observe that the operator norms of the matrices
$G_{[x_j, x)}$ with $x_j \le x < x_{j+1}$ are small for large $j$ and all $k$ outside a set of measure $\epsilon/2$.
Using the triangle inequality (38) one can show that for $k$ outside a set of measure $\epsilon$ the
limit of $\log|G_{[0,x)}|$ exists. This proves existence of the limits in (0). Similar arguments as
these make it straightforward to prove @ for arbitrary potentials $F \in L^2([0, \infty))$.

5. Multilinear expansions 
Writing the differential equation (^) as an integral equation 

we can use Picard iteration to obtain the formal solution 

Christ and Kiselev [f|,[[/J prove convergence for almost every k of this formal expansion 
if F G L P (R) with p < 2, and they use the expansion to show the maximal Hausdorff 
Young inequality (they work on a different model of the nonlinear Fourier transform, but 
their arguments apply to this case too). In JTT| it has been shown that the higher order 



terms of the Fourier analogue of this expansion are unbounded for F G L 2 and therefore 
not very well suited to be used to prove a nonlinear Carleson theorem. 

In this section we show that a similar discussion as in applies in the $d$-adic setting
provided $d \ge 3$. More precisely we will focus on the quadratic term in the above expansion
and on the case $d = 3$ and prove Proposition 5.1 below. The arguments generalize to $d > 3$
and to the higher-linear terms, but we shall not elaborate on this because our main point
can be made clear for this special case.

The quadratic term in the above expansion is a diagonal matrix with entries 

Q{F){k,x)= I F(ti)w(k, tjFfoWk, t 2 ) dt x dt 2 (40) 

JU <U<x 



and the complex conjugate of 
We consider 

M(F){k) = sup 



F(h)w(k, t 1 )F{t 2 )w(k, t 2 ) dtidt 2 

t\<ti<x 



(41) 



The following proposition implies that there is no reasonable a priori bound for the size
of this function in terms of the $L^2$ norm of $F$.

Proposition 5.1. Let $d = 3$. There is an $\epsilon > 0$ such that for each $N > 0$ there is a finite
$d$-adic step function $F$ with $L^2$ norm 1 such that

\[ |\{k : M(F)(k) > N\}| > \epsilon . \]

Proof: 



Pick a large integer N. In the proof of Lemma |2.2| we have seen that for a given tile 
p = I x io we can write 

w{k : x)li{k)l u {x) = Wp(k)w p (x) 

for some functions w p and w p which have constant modulus on / and 10 respectively and 
which we may assume to have L 2 norm 1. We observe for i£w: 



w p (k)w(k, x) dk = J Wp(k)w p (k)w p (x) dk = Wp(x) . (42) 

Thus, relying on the well known fact that the Cantor group Fourier transform is an 
isometry in L 2 (this can be shown by a linearized version of the arguments in Section 0), 
we see that the integral on the left hand side of ( |2 ) , which is the Walsh- Fourier transform 
of w p , has to vanish outside up and thus Wp is indeed the Cantor group Fourier transform 
of Wp. 

Consider the tiles 

Pj = I, x Uj := [3- N j, 3~ N (j + 1)) x [3 N j, 3 N (j + 1)) 

for j = 0, . . . , 3 N — 1 and set 

3^-1 3^-1 



F(x) : Yl F J { - r) := J2 ""'» (r] 



3=0 j=Q 



Thus ||F|| 2 = 1. 

Observe that the intervals Uj above form a partition of the interval [0,3 2N ). For k e 
[0, 1] let x(k) be the left endpoint of the unique interval Uj which contains 3 2N k. Consider 
for fcG [0,1): 



Q(F)(k)= / F{t l )w{kM)F{t 2 )w{k,t 2 )dt l dt 2 

't 1 <t 2 <x(k) 




We shall show that the imaginary part of $Q$ is large on a big set:

\[ |\{k \in [0,1) : |\Im(Q(F)(k))| > N/4\}| > 1/3 , \]

which will prove the proposition. Our argument works only for the imaginary part.
Indeed, the real part of $Q$ can be seen to satisfy good estimates. This is the reason why
our argument works only for $d \ge 3$. If $d = 2$, then the characters $w(k,x)$ are purely real
and our argument does not work. We do not know whether for $d = 2$ the series (|3D| )
converges for genuinely complex $F \in L^2$ in a reasonable sense.
We may split 

Q{F){k) = I Fj(h)w(k, t 1 )F J -/(t 2 ) w (fc, t 2 ) dhdt 2 

0<j,j'<3 N -l ^*i<*2<a;(fc) 

If j > f then the integrand is zero on the domain of integration, thus we may disregard 
these terms. Likewise, if 3 / > x(k), the integrand is zero. If j < j' < 3~ N x(k), then 
the constraint t± < t% < x(k) in the domain of integration is superfluous and we can write 



Fj(t 1 )w(k, tx)F r {t 2 )w{k, t 2 ) dhdh 

h<t 2 <x(k) 



Fj(ti)w(k,ti) dti / F f (t 2 )w(k,t 2 )dt 2 



= w Vj (k)w Pj/ (k) = . 
Thus it only remains to consider the terms with j = j' < 3~ N x(k): 

Q{F){k)= [ Fj(h)w(k, ti)F 3 -(t 2 )w(fc, h) dt x dt 2 



0<j<3~ N x(k) 

For each t% < t 2 , t\,t 2 e ujj there is a minimal 3-adic interval u C ujj such that t\ and 
t 2 are both contained in u. Then t\ and t 2 are in different 3-adic subintervals out m ) and 
u)i m >) of the next smaller generation of uj . Indeed, m < m'. We split Q(F) according to 
the size of u as follows: 

Q(F)(k) = J2 E E E / F j {t 2 )w{kM)dt2 j Tjitjwfctjdh 

n<N 0<j<3~ N x(k) \w\=3 K 0<m<m'<2^ w (™) J "(m') 

Now fix k and j. Let the — K-th coefficient in the ternary expansion of k be k_ K and 
the — /c-th coefficient in the ternary expansion of 3~ N j be Jn-k,- Then we observe that for 
m = 0,1,2 



FAt 2 )w{k, t 2 ) dt 2 = f^-"^-« / Fj(t 2 )w(k, t 2 ) dt 2 



(m) JU(0) 



since there is a bijection of U(p) to oj(m) given by switching the k — 1-st coefficient of each 
element in ui(p) from to m. Moreover 



Fj(t 2 )w(k,t 2 )dt 2 \ = 3 

(m) 



-N+K-l 
provided to C [3 j, 3 (j + 1)) and 3 j and k are in the same 3-adic interval iy) of length 
Otherwise the integral on the left hand side is zero. Thus 

3 Yl I F j (t 2 )w(k,t 2 )dt 2 j Tjit^wikM) dh (43) 



|cj|=3 k l<m<m'<3 " "{m) " "(m') 

7 



1/ (k)3~ N+K ~ 2 ^s - 1 if'-'-.'-./A ,.-] 



l<m<m'<3 

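The imaginary part of the sum

\[ \sum_{0 \le m < m' \le 2} \gamma^{(m-m')(k_{\kappa} - j_{N-\kappa})} \tag{44} \]

is equal to 0 if $k_{\kappa} - j_{N-\kappa} = 0$, it is equal to $-\Im(\gamma)$ if $k_{\kappa} - j_{N-\kappa} = 1$ or $-2$, and it is equal
to $\Im(\gamma)$ if $k_{\kappa} - j_{N-\kappa} = -1$ or $2$.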
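The three cases for the imaginary part of the sum (44) can be confirmed by direct evaluation. The following is a small sketch under the assumption $\gamma = e^{2\pi i/3}$ and with the summation range $0 \le m < m' \le 2$; the name `pair_sum` and the argument `delta` (standing for the digit difference $k_{\kappa} - j_{N-\kappa}$) are hypothetical.

```python
import cmath
import math

# Assumed choice of the primitive third root of unity.
GAMMA = cmath.exp(2j * math.pi / 3)


def pair_sum(delta):
    """Sum of GAMMA**((m - mp) * delta) over all pairs 0 <= m < mp <= 2."""
    return sum(
        GAMMA ** ((m - mp) * delta)
        for m in range(3)
        for mp in range(m + 1, 3)
    )
```

Since the exponents depend on `delta` only modulo 3, the values for `delta = 1` and `delta = -2` coincide, as do those for `delta = -1` and `delta = 2`.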

Now we add the terms ( f4~3| ) over all < j < 3~ N k, i.e. we consider 



9 



Yl I Fj(h)w(k,t 2 )dt 2 ! Fj{t x )w(kM)dtx . (45) 

0<j<3- Jf i(fc) |w|=3 K 0<i<i'<2 Jwi Jul i' 

By the previous remarks it suffices to count the terms for which the contribution (|43|) 
is equal to — ^(7) and the number of terms for which it is equal to ^(7). If k is in the left 
third of a 3-adic interval of length 3~ K then we do not have a nonzero contribution from 
any j because the constraint 3~ N j < k and the constraint that 3~ N j and k are in the 
same 3-adic interval of length 3 _K implies that 3~ N j is in the left third of the same 3-adic 
interval of length 3~ K as k and thus jN-K = 0. By the discussion of fl44l) we therefore see 
that ( |45"D is equal to 0. 

If k is in the middle third of a 3-adic interval of length 3 _K , then we get a contribution 
if 3~ N j is in the left third of that interval. There are s N ^ K ~ 1 such values of j, thus (|45|) 
is equal to — 3 _3 S(7). 

If k is in the right third of a 3-adic interval of length 3~ K , then we get as many j in the 
left third as in the middle third, and their contributions cancel each other, thus ( f45| ) is 
equal to 0. 

Now summing (45) over all $\kappa \le N$ reduces to counting the number of scales $\kappa$ for which
$k$ is in the middle third of a 3-adic interval of length $3^{-\kappa}$. We may restrict attention to
$\kappa \ge 0$ because we assume $k \in [0,1)$ and hence $k$ is always in the left third of any 3-adic
interval of length $3^{-\kappa}$ if $\kappa < 0$.

Thus

\[ \Im(Q(F))(k) = -3^{-3}\, \Im(\gamma)\, \#\{0 \le \kappa \le N : k_{\kappa} = 1\} . \]

Since for each scale exactly one third (in measure) of all numbers in $[0,1)$ are in the middle
interval of a 3-adic interval of that scale, we obtain

\[ \int_0^1 \Im(Q(F))(k)\, dk = -(N+1)\, 3^{-4}\, \Im(\gamma) . \]

Moreover, clearly

\[ \sup_{k \in [0,1)} |\Im(Q(F))(k)| = (N+1)\, 3^{-3}\, \Im(\gamma) . \]
Thus 

\\{k : \%Q(F))(k)\ > (N + l^^m > 3- 2 . 
This proves Proposition 5.1 since the choice of $N$ was arbitrary. ■
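The counting step in the proof, namely that at each scale one third of all $k \in [0,1)$ have ternary digit 1, so that the number of such scales grows linearly on average, can be illustrated as follows. This is a sketch only: it samples $k = i/3^n$ and looks at $n$ digits, whereas the proof counts the $N+1$ scales $0 \le \kappa \le N$; the helper names are hypothetical.

```python
from fractions import Fraction


def middle_third_count(k_numerator, n):
    """Number of ternary digits equal to 1 among the n digits of k = k_numerator / 3**n."""
    count = 0
    for _ in range(n):
        k_numerator, digit = divmod(k_numerator, 3)
        count += digit == 1
    return count


def average_count(n):
    """Average of middle_third_count over all k = i / 3**n in [0, 1)."""
    total = sum(middle_third_count(i, n) for i in range(3 ** n))
    return Fraction(total, 3 ** n)
```

The average equals one third of the number of digits considered, which mirrors the factor $1/3$ between the supremum and the mean of $\Im(Q(F))$ above.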

6. Appendix 

In this section we prove the formula (Q). The proof uses complex contour integration,
which is essentially the only method we know to prove the inequality. It is a global
argument, which should be contrasted to the argument in Section 2 for the Cantor group
case, which uses local methods. The local methods are useful in proving Carleson's
theorem, while global methods are hard to adapt.

The argument in this section is well known; variants of it go back at least as far as
the article by Buslaev and Faddeev |4| or the work by Verblunsky |16|, p7|1 in the discrete
case.

Let $F$ be a compactly supported, complex valued, smooth function on $\mathbb{R}$. Consider
the solution to (2) with initial condition $G(-\infty) = \mathrm{id}$. Writing this initial value problem
as an integral equation and using Picard iteration as in Section 5 gives the solution as a
formal expansion



Indeed, this expansion is easily seen to converge using that the $L^1$ norm of $F$ is finite
and a symmetry argument for the integration domain to obtain a factor $1/n!$ for the $n$-th
multilinear term. At $x = \infty$ we obtain for the first entry $a(k)$ of $G(k, \infty)$

\[ a(k) = 1 + \sum_{n=1}^{\infty} \int_{t_1 < \dots < t_{2n}} \prod_{j=1}^{n} F(t_{2j-1}) \overline{F(t_{2j})}\, e^{2ik(t_{2j} - t_{2j-1})}\, dt_{2j-1}\, dt_{2j} . \]

This function $a(k)$ extends holomorphically to $k$ in the half plane $\Im(k) > 0$ because it is
a summable superposition of functions of the form $e^{ikt}$ with $t > 0$. Assume for now that
$a(k)$ does not have any zeros in the closed upper half plane; we will prove this at the end
of this section.

Then we can define a function $\log(a(k))$ in the upper half plane. We choose the branch
of the logarithm so that for $|k| \to \infty$ we have $\log(a(k)) \to 0$. It will become clear
momentarily that this is well defined.

We consider the counter clockwise contour integral over a large semicircle $C = C_1 + C_2$
where $C_1 = [-r, r]$ and $C_2 = \{k : |k| = r, \Im(k) \ge 0\}$. We show that on $C_2$ only the
first nontrivial term in the expansion of $a(k)$ gives a contribution to the integral. We do
a partial integration for this term:

\[ \int_{t_1 < t_2} F(t_1) \overline{F(t_2)}\, e^{2ik(t_2 - t_1)}\, dt_2\, dt_1 = \int_{s>0} \int_t \overline{F(t+s)} F(t)\, e^{2iks}\, dt\, ds \]

\[ = -\frac{1}{2ik} \int_t F(t) \overline{F(t)}\, dt - \frac{1}{2ik} \int_{s>0} \int_t \overline{F'(t+s)} F(t)\, e^{2iks}\, dt\, ds = -\frac{1}{2ik} \|F\|_2^2 + O(k^{-2}) . \]

Here the estimate on the remainder term can be seen by one further partial integration.

For all the other terms in the expansion we can do partial integration in all variables
$s_j = t_{2j-1} - t_{2j}$ so as to get the estimate $O(k^{-2})$ or even better for all these terms. Thus
we have for large $|k|$

\[ \log(a(k)) = -\frac{1}{2ik} \|F\|_2^2 + O(k^{-2}) . \]

Doing the integration on $C_2$ we obtain

\[ \int_{C_2} \log(a(k))\, dk = -\frac{\pi}{2} \|F\|_2^2 + O(r^{-1}) . \]

Since the contour integral over $C$ vanishes, we obtain in the limit $r \to \infty$

\[ \int \log(a(k))\, dk = \frac{\pi}{2} \|F\|_2^2 , \]

which implies (|j) because the right hand side is real.

It remains to prove that $a$ does not have any zeros in the upper half plane. To this
end write $a(k,x)$ and $b(k,x)$ for the entries in the first column of $G(k,x)$ and consider the
quantity

\[ |a(k,x)|^2 |e^{-2ikx}|^2 - |b(k,x)|^2 . \tag{46} \]

Writing $\Re(k) = \sigma$ and $\Im(k) = \tau$, the partial derivative in the $x$ variable of this expression
is

23? [F(x)e 2iax e 2TX b(k,x)a(k,x) -F(x)e- 2iax e 2TX a(k,x)b(k,x)] + 2r\a(k, x)\ 2 e 2rx 

= 4T\a{k,x)\ 2 e 4TX . 
The latter is always positive for $\tau > 0$. Since (46) is equal to

\[ |e^{-2ikx}|^2 \]

for $x$ near $-\infty$, we conclude that (46) is positive for all $x$ and all $\tau > 0$. This proves that
$a(k,x)$ is nonzero for such $x$ and $\tau$. For $\tau = 0$ we observe by a similar argument that
$|a(k,x)|^2 - |b(k,x)|^2$ is constant equal to 1 and thus $a$ has no zeros on the real axis either.